AITopics | residual norm

Collaborating Authors

residual norm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sublinear Time Orthogonal Tensor Decomposition

Zhao Song, David Woodruff, Huan Zhang

Neural Information Processing SystemsMar-23-2026, 04:07:04 GMT

Their algorithm is based on computing sketches of the input tensor, which requires reading the entire input. We show in a number of cases one can achieve the same theoretical guarantees in sublinear time, i.e., even without reading most of the input tensor. Instead of using sketches to estimate inner products in tensor decomposition algorithms, we use importance sampling. To achieve sublinear time, we need to know the norms of tensor slices, and we show how to do this in a number of important cases.

artificial intelligence, machine learning, tensor, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)

Add feedback

Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes

Neural Information Processing SystemsFeb-8-2026, 22:12:41 GMT

Scaling hyperparameter optimisation to very large datasets remains an open problem in the Gaussian process community.

artificial intelligence, linear system solver, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Understanding the Implicit Regularization of Gradient Descent in Over-parameterized Models

Ma, Jianhao, Liang, Geyu, Fattahi, Salar

arXiv.org Artificial IntelligenceDec-10-2025

Implicit regularization refers to the tendency of local search algorithms to converge to low-dimensional solutions, even when such structures are not explicitly enforced. Despite its ubiquity, the mechanism underlying this behavior remains poorly understood, particularly in over-parameterized settings. We analyze gradient descent dynamics and identify three conditions under which it converges to second-order stationary points within an implicit low-dimensional region: (i) suitable initialization, (ii) efficient escape from saddle points, and (iii) sustained proximity to the region. We show that these can be achieved through infinitesimal perturbations and a small deviation rate. Building on this, we introduce Infinitesimally Perturbed Gradient Descent (IPGD), which satisfies these conditions under mild assumptions. We provide theoretical guarantees for IPGD in over-parameterized matrix sensing and empirical evidence of its broader applicability.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Artificial Intelligence

2505.17304

Country: North America > United States (0.45)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes

Neural Information Processing SystemsOct-9-2025, 20:11:40 GMT

Scaling hyperparameter optimisation to very large datasets remains an open problem in the Gaussian process community.

dataset, estimator, linear system solver, (12 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Scalable Gaussian Processes: Advances in Iterative Methods and Pathwise Conditioning

Lin, Jihao Andreas

arXiv.org Machine LearningJul-10-2025

Gaussian processes are a powerful framework for uncertainty-aware function approximation and sequential decision-making. Unfortunately, their classical formulation does not scale gracefully to large amounts of data and modern hardware for massively-parallel computation, prompting many researchers to develop techniques which improve their scalability. This dissertation focuses on the powerful combination of iterative methods and pathwise conditioning to develop methodological contributions which facilitate the use of Gaussian processes in modern large-scale settings. By combining these two techniques synergistically, expensive computations are expressed as solutions to systems of linear equations and obtained by leveraging iterative linear system solvers. This drastically reduces memory requirements, facilitating application to significantly larger amounts of data, and introduces matrix multiplication as the main computational operation, which is ideal for modern hardware.

artificial intelligence, machine learning, neural information processing system, (16 more...)

arXiv.org Machine Learning

2507.06839

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Russia (0.04)
Europe > Denmark (0.04)
(2 more...)

Genre:

Workflow (0.92)
Summary/Review (0.88)
Research Report (0.81)

Industry:

Education (0.67)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Modeling & Simulation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

Add feedback

Residual Random Neural Networks

Andrecut, M.

arXiv.org Artificial IntelligenceNov-4-2024

The single-layer feedforward neural network with random weights is a recurring motif in the neural networks literature. The advantage of these networks is their simplified training, which reduces to solving a ridge-regression problem. A general assumption is that these networks require a large number of hidden neurons relative to the dimensionality of the data samples, in order to achieve good classification accuracy. Contrary to this assumption, here we show that one can obtain good classification results even if the number of hidden neurons has the same order of magnitude as the dimensionality of the data samples, if this dimensionality is reasonably high. Inspired by this result, we also develop an efficient iterative residual training method for such random neural networks, and we extend the algorithm to the least-squares kernel version of the neural network model. Moreover, we also describe an encryption (obfuscation) method which can be used to protect both the data and the resulted network model.

accuracy, dimensionality, matrix, (14 more...)

arXiv.org Artificial Intelligence

2410.19987

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > The Hague (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.57)

Add feedback

Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes

Lin, Jihao Andreas, Padhy, Shreyas, Mlodozeniec, Bruno, Antorán, Javier, Hernández-Lobato, José Miguel

arXiv.org Machine LearningJun-6-2024

Scaling hyperparameter optimisation to very large datasets remains an open problem in the Gaussian process community. This paper focuses on iterative methods, which use linear system solvers, like conjugate gradients, alternating projections or stochastic gradient descent, to construct an estimate of the marginal likelihood gradient. We discuss three key improvements which are applicable across solvers: (i) a pathwise gradient estimator, which reduces the required number of solver iterations and amortises the computational cost of making predictions, (ii) warm starting linear system solvers with the solution from the previous step, which leads to faster solver convergence at the cost of negligible bias, (iii) early stopping linear system solvers after a limited computational budget, which synergises with warm starting, allowing solver progress to accumulate over multiple marginal likelihood steps. These techniques provide speed-ups of up to $72\times$ when solving to tolerance, and decrease the average residual norm by up to $7\times$ when stopping early.

estimator, linear system solver, residual norm, (12 more...)

arXiv.org Machine Learning

2405.18457

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems

Chen, Jie

arXiv.org Artificial IntelligenceJun-2-2024

Preconditioning is at the heart of iterative solutions of large, sparse linear systems of equations in scientific disciplines. Several algebraic approaches, which access no information beyond the matrix itself, are widely studied and used, but ill-conditioned matrices remain very challenging. We take a machine learning approach and propose using graph neural networks as a general-purpose preconditioner. They show attractive performance for ill-conditioned problems, in part because they better approximate the matrix inverse from appropriately generated training data. Empirical evaluation on over 800 matrices suggests that the construction time of these graph neural preconditioners (GNPs) is more predictable than other widely used ones, such as ILU and AMG, while the execution time is faster than using a Krylov method as the preconditioner, such as in inner-outer GM-RES. GNPs have a strong potential for solving large-scale, challenging algebraic problems arising from not only partial differential equations, but also economics, statistics, graph, and optimization, to name a few.

matrix, neural network, preconditioner, (16 more...)

arXiv.org Artificial Intelligence

2406.00809

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.40)

Industry: Construction & Engineering (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Sublinear Time Orthogonal Tensor Decomposition ∗ ⋆ ‡ Dept. of Computer Science, University of Texas, Austin, USA

Neural Information Processing SystemsApr-10-2023, 07:51:48 GMT

algorithm, decomposition, tensor, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.40)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
North America > United States > California > Yolo County > Davis (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)

Add feedback

NFFT meets Krylov methods: Fast matrix-vector products for the graph Laplacian of fully connected networks

Alfke, Dominik, Potts, Daniel, Stoll, Martin, Volkmer, Toni

arXiv.org Machine LearningAug-14-2018

The graph Laplacian is a standard tool in data science, machine learning, and image processing. The corresponding matrix inherits the complex structure of the underlying network and is in certain applications densely populated. This makes computations, in particular matrix-vector products, with the graph Laplacian a hard task. A typical application is the computation of a number of its eigenvalues and eigenvectors. Standard methods become infeasible as the number of nodes in the graph is too large. We propose the use of the fast summation based on the nonequispaced fast Fourier transform (NFFT) to perform the dense matrix-vector product with the graph Laplacian fast without ever forming the whole matrix. The enormous flexibility of the NFFT algorithm allows us to embed the accelerated multiplication into Lanczos-based eigenvalues routines or iterative linear system solvers. We illustrate the feasibility of our approach on a number of test problems from image segmentation to semi-supervised learning based on graph-based PDEs. In particular, we compare our approach with the Nystr\"om method. Moreover, we present and test an enhanced, hybrid version of the Nystr\"om method, which internally uses the NFFT.

artificial intelligence, machine learning, matrix-vector product, (13 more...)

arXiv.org Machine Learning

1808.0458

Country:

Europe > Germany (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback